In software engineering, a project fork happens when developers take a legal copy of source code from one software package and start independent development on it, creating a distinct piece of software. The term implies not merely a development branch, but a split in the developer community, analogous to a religious schism.
Free and open source software can, by definition, be forked without prior permission from the original development team and without violating copyright law. However, licensed forks of proprietary software (e.g. Unix) also happen.
The term "fork" was first used in the sense of "branch" by Eric Allman in 1980, to describe forming branches in SCCS:
The term was in use on Usenet by 1983 for the process of creating a subgroup to move topics of discussion to.[2]
"Fork" is not known to have been used in the sense of a community schism during the origins of Lucid Emacs (now XEmacs) (1991) or the BSDs (1993-4); Russ Nelson used the term "shattering" for this sort of fork in 1993, attributing it to John Gilmore.[3] However, "fork" was in use in the present sense by 1995 to describe the XEmacs split[4], and was an understood usage in the GNU Project by 1996.[5]
Free and open source software may be legally forked without the approval of those currently managing a software project or distributing the software, per the definitions of "free software" ("Freedom 3: The freedom to improve the program, and release your improvements to the public, so that the whole community benefits") and "open source" ("3. Derived Works: redistribution of modifications must be allowed. (To allow legal sharing and to permit new features or repairs.)").[6]
In free software, forks often result from a schism over differing goals or from personality clashes. In a fork, both parties start from nearly identical code bases, but typically only the larger group, or whoever controls the web site, retains the full original name and the associated user community. Thus, there is a reputation penalty associated with forking.[6] The relationship between the different teams can be cordial or very bitter.
Eric S. Raymond, in his essay Homesteading the Noosphere,[7] stated that "The most important characteristic of a fork is that it spawns competing projects that cannot later exchange code, splitting the potential developer community". He notes in the Jargon File:[8]
David A. Wheeler notes[6] four possible outcomes of a fork, with examples:
More recently, distributed revision control (DVCS) tools have popularised a less emotive use of the term "fork", blurring the distinction with "branch". With a DVCS such as Mercurial or Git, the normal way to contribute to a project is to first branch the repository, and later seek to have the changes integrated with the main repository. Sites such as GitHub, Bitbucket and Launchpad provide free DVCS hosting expressly supporting independent branches, such that the technical, social and financial barriers to forking a source code repository are massively reduced.
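The DVCS workflow described above can be illustrated with a toy model. The following Python sketch is not how Git or Mercurial are implemented: the `Repository` class, its methods, and the commit identifiers are all hypothetical, and history is simplified to a flat list with no conflict handling. It only shows that a fork begins as a full copy of the history, diverges independently, and can later be integrated back.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Repository:
    """Toy repository (hypothetical): an owner and an ordered list of commit IDs."""
    owner: str
    commits: List[str] = field(default_factory=list)

    def commit(self, commit_id: str) -> None:
        """Record a new commit in this repository only."""
        self.commits.append(commit_id)

    def fork(self, new_owner: str) -> "Repository":
        """Create an independent copy that shares the history so far."""
        return Repository(new_owner, list(self.commits))

    def missing_from(self, other: "Repository") -> List[str]:
        """Commits present in `other` that this repository lacks."""
        have = set(self.commits)
        return [c for c in other.commits if c not in have]

    def pull(self, other: "Repository") -> None:
        """Integrate the other side's commits (simplified: append, no conflicts)."""
        self.commits.extend(self.missing_from(other))


# A contributor forks the main repository and works independently.
upstream = Repository("upstream", ["c1", "c2"])
contrib = upstream.fork("contributor")
contrib.commit("c3-fix")

# Meanwhile the main repository also moves on.
upstream.commit("c4")

# The contributor asks for integration; the maintainers see what is pending.
pending = upstream.missing_from(contrib)
upstream.pull(contrib)
print(pending)
print(upstream.commits)
```

In this model the only difference between a short-lived "branch" and a lasting "fork" is whether `pull` is ever called, which is one way to read the blurring of the two terms.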
Forks often restart version numbering from 0.1 or 1.0 even if the original software was at version 3.0, 4.0, or 5.0. An exception is when the forked software is designed to be a drop-in replacement for the original project, in which case, for example, forked version 5.2 is compatible with version 5.2 of the original software (as is the case with MariaDB and MySQL as of 2011).[9]
In proprietary software, the copyright is usually held by the employing entity, not by the individual software developers. Proprietary code is thus more commonly forked when the owner needs to develop two or more versions, such as a windowed version and a command line version, or versions for differing operating systems, such as a word processor for IBM PC compatible machines and Macintosh computers. Generally, such internal forks concentrate on having the same look, feel, data format, and behavior across platforms, so that a user familiar with one can also be productive on the other or share documents between them. This is almost always an economic decision, intended to gain a greater market share and thus pay back the extra development costs created by the fork.
Notable proprietary forks not of this kind are the many varieties of proprietary Unix, almost all derived from AT&T Unix and all called "Unix", but increasingly mutually incompatible.[10] See UNIX wars.
The BSD licenses permit forks to become proprietary software, and some say that commercial incentives thus make proprietisation almost inevitable. Examples include Mac OS X (based on the proprietary NeXTSTEP and the open source FreeBSD), Cedega and CrossOver (proprietary forks of Wine, though CrossOver tracks Wine and contributes considerably), EnterpriseDB (a fork of PostgreSQL, adding Oracle compatibility features), Fujitsu Supported PostgreSQL with its proprietary ESM storage system, and Netezza's proprietary, highly scalable derivative of PostgreSQL. Some of these vendors contribute changes back to the community project, while others keep their changes as a competitive advantage.